Overview

Dataset Statistics

Number of Variables 4
Number of Rows 16000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.7 MB
Average Row Size in Memory 113.8 B
Variable Types
  • Numerical: 3
  • Categorical: 1

Dataset Insights

  • item_id is skewed (Skewed)
  • rating is skewed (Skewed)
  • name has a high cardinality: 3303 distinct values (High Cardinality)
  • rating has 3022 (18.89%) zeros (Zeros)
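
The percentages behind these insights follow directly from the counts in this report. A minimal sketch of that arithmetic, using only figures stated here (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from this report.
n_rows = 16000
zero_ratings = 3022
distinct_names = 3303

# Share of rows with a zero rating, in percent.
zero_share = zero_ratings / n_rows * 100
# Distinct names relative to the number of rows, in percent.
distinct_share = distinct_names / n_rows * 100

print(round(zero_share, 2), round(distinct_share, 1))
```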

Variables


user_id

numerical

Approximate Distinct Count 12330
Approximate Unique (%) 77.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 36810.6389
Minimum 7
Maximum 73510
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • user_id is skewed left (γ1 = -0.0248)

Quantile Statistics

Minimum 7
5-th Percentile 4000.95
Q1 18943
Median 37240
Q3 54711.5
95-th Percentile 69122.25
Maximum 73510
Range 73503
IQR 35768.5

Descriptive Statistics

Mean 36810.6389
Standard Deviation 20953.0965
Variance 4.3903e+08
Sum 5.8897e+08
Skewness -0.02476
Kurtosis -1.2017
Coefficient of Variation 0.5692
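
Skewness (γ1) and kurtosis can be derived from central moments. The sketch below assumes the population (divide-by-n) estimators and reports excess kurtosis; the report does not state which estimators it uses, so treat this as one plausible reading. The sample values are made up for illustration.

```python
def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n


def skewness(values):
    """Fisher-Pearson skewness: m3 / m2**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """m4 / m2**2 - 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


sample = [1, 2, 3, 4, 5]  # hypothetical, evenly spread values
print(skewness(sample), excess_kurtosis(sample))
```

An evenly spread, symmetric sample gives skewness near 0 and negative excess kurtosis, the same pattern user_id shows here (-0.02476 and -1.2017).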

item_id

numerical

Approximate Distinct Count 3303
Approximate Unique (%) 20.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 8944.7836
Minimum 1
Maximum 34240
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • item_id is skewed right (γ1 = 0.9938)

Quantile Statistics

Minimum 1
5-th Percentile 130.95
Q1 1265.75
Median 6325
Q3 14227
95-th Percentile 28423
Maximum 34240
Range 34239
IQR 12961.25

Descriptive Statistics

Mean 8944.7836
Standard Deviation 8883.916
Variance 7.8924e+07
Sum 1.4312e+08
Skewness 0.9938
Kurtosis -0.00046112
Coefficient of Variation 0.9932
  • item_id is not normally distributed (p-value 1.0194117601083228e-17)
  • item_id has 5 outliers
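
The report does not say how outliers are flagged. One common convention, assumed here, is Tukey's rule: values more than 1.5 × IQR beyond the quartiles. A sketch applying it to the item_id quartiles above:

```python
# Quartiles and maximum copied from the item_id section of this report.
q1, q3 = 1265.75, 14227.0
maximum = 34240

iqr = q3 - q1                 # 12961.25, matches the reported IQR
lower_fence = q1 - 1.5 * iqr  # negative, below the reported minimum of 1
upper_fence = q3 + 1.5 * iqr

print(lower_fence, upper_fence, maximum > upper_fence)
```

Under this assumption only values above the upper fence would count as outliers, which is consistent with a maximum of 34240 and a small outlier count.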

rating

numerical

Approximate Distinct Count 11
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 6.3329
Minimum 0
Maximum 10
Zeros 3022
Zeros (%) 18.9%
Negatives 0
Negatives (%) 0.0%
  • rating is skewed left (γ1 = -1.0219)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 5
Median 7
Q3 9
95-th Percentile 10
Maximum 10
Range 10
IQR 4

Descriptive Statistics

Mean 6.3329
Standard Deviation 3.3644
Variance 11.319
Sum 101327
Skewness -1.0219
Kurtosis -0.3506
Coefficient of Variation 0.5313
  • rating is not normally distributed (p-value 3.947640117223782e-10)
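
Several of the rating statistics are derived from others: variance is the squared standard deviation, the coefficient of variation is standard deviation over mean, and IQR is Q3 minus Q1. A small consistency check using only figures from this report (variable names are illustrative):

```python
import math

# Figures copied from the rating section of this report.
mean, std, variance = 6.3329, 3.3644, 11.319
q1, q3, minimum, maximum = 5, 9, 0, 10

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # reported as Range
cv = std / mean                  # coefficient of variation

# The report rounds its figures, so compare with a tolerance.
assert math.isclose(std ** 2, variance, rel_tol=1e-3)
assert math.isclose(cv, 0.5313, rel_tol=1e-3)
print(iqr, value_range, round(cv, 4))
```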

name

categorical

Approximate Distinct Count 3303
Approximate Unique (%) 20.6%
Missing 0
Missing (%) 0.0%
Memory Size 1452635 B

Length

Mean 23.1572
Standard Deviation 14.3809
Median 19
Minimum 1
Maximum 98

Sample

1st row Naruto
2nd row Naruto
3rd row Naruto
4th row Naruto
5th row Naruto

Letter

Count (uppercase + lowercase letters) 311048
Lowercase Letter 259121
Space Separator 44056
Uppercase Letter 51927
Dash Punctuation 1829
Decimal Number 3082
  • name contains many words: 4593 words
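
The labels in the Letter section match Unicode general category names. A minimal sketch of how such counts could be produced with Python's standard `unicodedata` module; whether the profiling tool works this way is an assumption, and "Example Title-2" is a made-up string added next to the sampled value "Naruto".

```python
import unicodedata
from collections import Counter

# Unicode general category codes mapped to the labels used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(names):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for name in names:
        for ch in name:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["Naruto", "Example Title-2"])  # second value is hypothetical
letters = counts["Lowercase Letter"] + counts["Uppercase Letter"]
print(dict(counts), letters)
```

In this report the lowercase and uppercase counts (259121 + 51927) add up to the 311048 listed under Count.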

Interactions

Correlations

Missing Values